QOGMP: QoS-oriented global multi-path traffic scheduling algorithm in software defined network

A review of research on traffic scheduling at the Software Defined Network (SDN) control layer reveals four common problems: single-path transmission, proneness to congestion, Quality of Service (QoS) requirements, and high delay. To address these four problems, we design and implement a QoS-oriented global multi-path traffic scheduling algorithm for SDN, referred to as QOGMP. First, we propose a link weight calculation algorithm based on the idea of traction links and deep reinforcement learning, and experimentally verify the effectiveness of traction links. The algorithm considers QoS requirements and alleviates the problems of easy congestion and high delay. Then, we propose a traffic scheduling algorithm based on link weights and a multi-path scheme, which also considers QoS requirements and solves the single-path problem. Finally, we combine the link weight calculation algorithm and the traffic scheduling algorithm to implement QOGMP, and carry out comparative experiments in the simulation environment we built. The experimental results show that QOGMP is better than the two comparison algorithms in terms of delay and rescheduling rate.


Scientific Reports | (2022) 12:14600 | https://doi.org/10.1038/s41598-022-18919-w | www.nature.com/scientificreports/
The design concept of SDN is to separate the control layer and the data layer of the network while providing programmable control. This gives distributed networks centralized management and dynamic maintenance capabilities [7,8], and thereby effectively addresses the disadvantages of the traditional IP network in maintenance, expansion, and experimental innovation. The typical SDN architecture is divided into three layers [9]. The top layer is the application layer, which includes various services and applications. The middle layer is the control layer, which is mainly responsible for orchestrating data resources and maintaining information such as network topology and status. The main body of the control layer is a logically centralized, programmable controller that holds global network information, which makes it convenient for operators and researchers to manage and configure the network and to deploy new protocols. The bottom layer is the data layer, which is responsible for data processing, forwarding, and status collection based on the flow table. The main body of the data layer is a large number of "dumb" switches (unlike traditional layer-2 switches, the term here refers specifically to devices used to forward data). These switches only provide simple data forwarding functions and can quickly process matched data packets to meet the increasing traffic demand. An open, unified interface (such as OpenFlow [10]) is used for interaction between the control layer and the data layer. The controller issues unified standard rules to the switches through this standard interface, and the switches only need to perform the corresponding actions in accordance with those rules.
Unlike the "slice" management of the traditional network [11], the control layer can use the global network view and dynamic rule configuration capabilities provided by SDN to perform load balancing and flexible traffic scheduling [12]. This largely removes the limitations of traffic scheduling in the traditional network, thereby maintaining network availability and improving network performance. Therefore, research on SDN-based control layer traffic scheduling methods is of great significance.

Research status and content
Research status. In recent years, many scholars have studied SDN-based control-layer traffic scheduling methods. This research addresses four main problems: single path, easy congestion, QoS requirements, and high latency.
(1) Single path. Literature [13][14][15][16][17] all address the single-path problem through multi-path transmission. For example, literature [13] proposed an equal-cost multi-path (ECMP) scheme, which is currently widely used. However, this type of solution has two major problems: first, a multi-path transmission scheme implemented under specific conditions lacks generality in the SDN environment; second, traffic scheduling can only use local information, which is likely to cause congestion.
(2) Easy to congest. Network traffic has grown explosively, and the traditional network architecture cannot schedule it flexibly, quickly, and effectively. In addition, the load status of a path is difficult to obtain, so congestion is prone to occur, resulting in low link utilization. Literature [18][19][20][21][22][23] address the congestion problem by processing elephant flows and by using SDN global link load information. These methods introduce additional overheads such as query detection and congestion calculation, which increase the forwarding delay to a certain extent.
(3) QoS requirements. When the network is overloaded or congested, QoS can ensure that important services are not affected by delay or packet loss during transmission [24], while keeping the network running efficiently. At present, SDN can provide QoS guarantees through mechanisms such as flow control and bandwidth reservation, but it is difficult to meet the increasing QoS demands of business applications [25]. A genuinely QoS-oriented traffic scheduling scheme is needed. Literature [26][27][28][29] mainly address the QoS problem by introducing user-defined constraints and by jointly considering the link occupancy rate and the size of the business flow. These methods also introduce additional storage overhead, such as packet loss rate measurement and link occupancy calculation, together with the delay this causes. There is no widely used standard for user-defined constraints, and even the SDN northbound interface (between the application layer and the control layer), on which they are built, does not yet have a unified or recognized standard. Although the research space is relatively large, the current practicability and research value of this approach are limited.
(4) High latency. The solutions to the above three problems introduce considerable delay pressure. In addition, current routing algorithms perform poorly on large-scale networks, so an efficient forwarding path calculation algorithm is needed. Machine learning algorithms can usually extract traffic characteristics automatically and do not rely on expert experience, so they are more efficient than traditional solutions in solving traffic scheduling problems. Literature 30 32 .
In the traditional single-path transmission mode, all QoS-related network parameters can be used for link weight calculation and the optimal path can be selected. However, when we adopt multi-path transmission, we need to balance multiple paths, taking into account both the efficiency of the path selection scheme and the success rate of scheduling. Therefore, to ensure the efficiency of the path selection scheme, we use the packet loss and delay parameters, combined with a deep reinforcement learning algorithm, for link weight calculation. To improve the scheduling success rate of the scheme, we use the link capacity parameter to calculate the traffic forwarding path.
The main research content is as shown in Fig. 1. The paper structure is as follows: (1) In section two, we propose a link weight calculation algorithm based on the idea of traction link and deep reinforcement learning, and conduct related experiments to verify the effectiveness of traction link. This algorithm provides QoS guarantee and alleviates the problems of easy congestion and high delay. (2) In section three, we propose a traffic scheduling algorithm based on link weight and multi-path scheme, which also considers QoS requirements and solves the problem of single path. (3) In section four, we combine the algorithms proposed in section two and section three to implement a QoS-oriented global multi-path traffic scheduling algorithm for SDN, or QOGMP for short. We conduct comparative experiments in the built simulation test environment. The experimental results show that the performance of QOGMP is better than that of the algorithms for comparison.

Link weight calculation algorithm
We design a link weight calculation algorithm based on the idea of traction link and deep reinforcement learning.
We use deep reinforcement learning algorithms to calculate link weights. The agent of the reinforcement learning system is a neural network (the strategy generation network), and its interaction with the environment is modeled as a Markov decision process. The process is represented by a four-tuple $E = \langle X, A, P, R \rangle$, where the probability $P$ defaults to 1, the state of the environment $x \in X$ is the current traffic view, the action $a \in A$ is the link weight value, and the reward $r \in R$ is the strategic value feedback provided by the environment to the neural network. By constantly trying actions, the strategy $\pi$ is obtained, and the action to be executed in state $x$ is $a = \pi(x)$. The quality of the strategy can be measured by the value function.
The strategy generation network is implemented by the neural network $a = \pi(x \mid \mu)$ with parameter $\mu$. The strategic value network is implemented by the neural network $Q(x_t, a_t \mid \theta)$ with parameter $\theta$.
The value function of the strategy is expressed as Eq. (1).
The reward function of the strategy is expressed as Eq. (2).
$$y_t = r_t + \gamma\, Q(x_t, \pi(x_{t+1}) \mid \theta) \qquad (2)$$
The parameters of the strategy value network and the strategy generation network need to be continuously adjusted based on the error of value and reward.
The parameter update of the strategic value network is in accordance with Eq. (3).
The input of the strategy generation network is the traffic view and reward, and the output is the link weight value we need. The traffic view is a summary of the link information collected and calculated by the data layer, including information such as nodes, links, and the cost of each link. Since different QoS requirements have different requirements for packet loss and delay, we express the link cost as Eq. (5).
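As a minimal illustration of Eq. (2), the sketch below computes the target value y_t from a reward and two placeholder functions. The names `td_target`, `policy`, and `q_value`, the toy functions, and the discount value are assumptions made for illustration only; they stand in for π(·|µ) and Q(·,·|θ) and are not the networks used in this paper. The state passed to the value function is an explicit argument (`x_eval`), so the expression can be evaluated with the state exactly as printed in Eq. (2).

```python
from typing import Callable, Sequence

State = Sequence[float]
Action = Sequence[float]


def td_target(
    r_t: float,
    x_eval: State,
    x_next: State,
    policy: Callable[[State], Action],
    q_value: Callable[[State, Action], float],
    gamma: float = 0.9,  # assumed discount factor, not taken from the paper
) -> float:
    """Return y_t = r_t + gamma * Q(x_eval, pi(x_next))."""
    return r_t + gamma * q_value(x_eval, policy(x_next))


# Toy stand-ins for the two networks (hypothetical, for illustration only).
def toy_policy(x: State) -> Action:
    return [0.5 * v for v in x]


def toy_q(x: State, a: Action) -> float:
    return sum(x) + sum(a)


y = td_target(1.0, [1.0, 2.0], [2.0, 4.0], toy_policy, toy_q, gamma=0.5)
```

In a training loop, the difference between this target and the current value estimate would drive the parameter updates referred to in Eq. (3) and Eq. (4).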
The flow of the link weight calculation algorithm based on traction link and deep reinforcement learning is shown in Algorithm 1.
    4: Initialize Noise  // Add random noise to the action to improve the strategy exploration effect
    5: for t = 1 to step do
    6:     a_t = π(x_t|µ) + Noise
    7:     Receive r_t and x_{t+1} from the environment
    8:     Update the parameter θ according to Eq. (3)
    9:     Update the parameter µ according to Eq. (4)
    10: end for
    11: end for
    12: Output: Link weight a
Assuming that the number of network nodes is n, the maximum number of optional links increases exponentially, as shown in Eq. (6).
Although deep reinforcement learning algorithms have strong computing power, in the face of such a huge amount of data, we still need to consider reducing the number of optional links. We adopt the idea of traction link proposed in literature 33 to alleviate this problem. Traction control theory points out that for the control of large-scale networks, it is only necessary to apply control signals to some nodes, and realize the diffusion of control signals through the connection relationship between nodes, and finally realize the coordination of the whole network, so as to achieve the control goal. For example, if there are 100 original links, the original method is to update the weight of 100 links, and finally get the path scheme L. Now, we extract 20 traction links from the original links, and update the weights of these 20 links. According to the traction control theory, the final path scheme is still L with great probability.
In the weight calculation phase, we replace original links with traction links included in original links to achieve the goal of not affecting the path selection result after the weight update, but greatly reducing the number of links to be processed.
Taking the link graph collected by the data layer as input, the flow of the traction link extraction algorithm is shown in Algorithm 2.

    13: Move v_j out of V_undone and add it to V_doing
    14: end if
    15: end if
    16: end for
    17:
We use the traction link graph output by Algorithm 2 as the input of Algorithm 1.

Traffic scheduling algorithm
We design a global multi-path traffic scheduling algorithm based on link weight and ECMP. Our ultimate goal is to generate a traffic scheduling scheme, that is, to calculate the traffic forwarding path and the traffic distribution on each path.
Unlike the single-path scheme, the multi-path scheme may distribute traffic unevenly. Some links are then underused while others come close to congestion, which causes high delay and low network throughput, so that the scheduling scheme fails and rescheduling is triggered. Therefore, after we update the link weights, we address this problem by balancing the link capacity and service flow of multiple alternative paths. The link capacity and service flow data are collected by the data layer and fed back to the control layer.
How many paths do we need for traffic matching? Using all available paths is not the answer; the question has to be studied on the basis of the ECMP scheme. ECMP is currently a general multi-path traffic scheduling scheme that can map a single flow to multiple paths. In general, if a flow is mapped to too many paths, the delay is low but the link utilization is too low. Conversely, if a flow is mapped to too few paths, the link utilization is high but the delay is too high. We need to determine how many links a flow should be mapped to in order to achieve the best compromise between link utilization and delay. To this end, we conduct a simple experiment.
We assume a network environment in which the service flow is fixed at 1 Mb and evenly distributed, and the delay consists only of the transmission delay. All links between the source address and the destination address are black boxes, and only the transmission rate and the number of links are known. We define a measure of the compromise between link occupancy and delay. The value of the performance compromise is equal to the product of all link occupancies divided by the longest delay among all links. The larger the value, the better the result. The occupancy of a link is defined as the link delay divided by the longest delay among all links. The delay of a link is defined as the service traffic allocated to the link divided by the transmission rate of the link.
When the number of links is 3, the performance compromise is calculated for different link mapping schemes and transmission rates, and the results are shown in Table 1. The number of mapped links is n, that is, the first n links are selected to calculate the performance compromise.
It can be drawn from Table 1 that the number of mapped links with the best performance compromise is 2 when the number of links is 3.
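The performance compromise defined above can be written out as a short function. This is only a sketch under stated assumptions: the function name `performance_compromise`, the parameter names, and the example transmission rates are hypothetical and do not reproduce the values in Table 1. The flow is split evenly over the first `n` links, as in the experiment.

```python
from math import prod
from typing import Sequence


def performance_compromise(rates: Sequence[float], n: int, flow: float = 1.0) -> float:
    """Compromise value when `flow` is split evenly over the first n links.

    rates: transmission rate of each candidate link (same unit as flow per second).
    """
    if not 1 <= n <= len(rates):
        raise ValueError("n must be between 1 and the number of links")
    share = flow / n                              # even split of the service flow
    delays = [share / r for r in rates[:n]]       # link delay = traffic / rate
    longest = max(delays)
    occupancies = [d / longest for d in delays]   # link occupancy = delay / longest delay
    return prod(occupancies) / longest


def best_mapping(rates: Sequence[float], flow: float = 1.0) -> int:
    """Number of mapped links with the largest compromise value."""
    return max(range(1, len(rates) + 1),
               key=lambda n: performance_compromise(rates, n, flow))
```

For instance, with assumed rates of 4, 2, and 1 Mb/s, mapping to the first two links gives delays of 0.125 s and 0.25 s, occupancies of 0.5 and 1, and a compromise value of 0.5 / 0.25 = 2.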
The above experimental process is a calculation from the original number of links to the number of mapped links with the best performance compromise. We perform similar calculations and records for different numbers of links, and summarize the number of mapped links with the best performance compromise, as shown in Table 2.
Table 2 shows that the number of mapped links with the best performance compromise is [√n] when the number of links is n. Therefore, we take [√n] paths for traffic distribution. There are two schemes for setting the traffic ratio. The first configures the ratio according to the weight, provided that the link capacity meets a certain requirement (to ensure that the delay is acceptable). The second simply allocates traffic according to the margin ratio. The margin ratio of a path is its capacity divided by the total capacity of all paths to be allocated, and the capacity of a path is the minimum capacity of all links on that path.
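The margin-ratio scheme can be illustrated with a small sketch. The function names and the example capacities are assumptions introduced here for illustration; each path is represented simply as the list of capacities of its links.

```python
from typing import List, Sequence


def path_capacity(link_capacities: Sequence[float]) -> float:
    """Capacity of a path: the minimum capacity over its links."""
    return min(link_capacities)


def margin_ratios(paths: Sequence[Sequence[float]]) -> List[float]:
    """Margin ratio of each path: its capacity over the total capacity of all paths."""
    caps = [path_capacity(p) for p in paths]
    total = sum(caps)
    if total <= 0:
        raise ValueError("total path capacity must be positive")
    return [c / total for c in caps]


def split_flow(flow: float, paths: Sequence[Sequence[float]]) -> List[float]:
    """Traffic assigned to each path under the margin-ratio scheme."""
    return [flow * ratio for ratio in margin_ratios(paths)]
```

With three hypothetical paths whose link capacities are [10, 4, 8], [6, 6], and [2, 9], the path capacities are 4, 6, and 2, so a flow of 12 units is split approximately as 4, 6, and 2.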
    12:
    13: if S_i/c_i > D then
    14:     break
    15: end if
    16: end for
    17:
We update the link weight according to the output of Algorithm 1. If the weight is not updated, the default is 1. We then calculate the [√n] weighted shortest paths iteratively using Dijkstra's algorithm: execute Dijkstra's algorithm to find the first shortest path from the source point to the destination point, remove that path and execute the algorithm again to find the second shortest path, and iterate until [√n] paths are found. Finally, we execute Algorithm 3 to generate a traffic scheduling scheme.
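The iterative use of Dijkstra's algorithm described above can be sketched as follows. This is a minimal sketch, not the implementation used in this paper. It assumes that "removing" a found path means deleting its links from a working copy of the graph, that links are undirected, and that the graph is given as a nested dictionary `adj[u][v] = weight`; the function names are hypothetical.

```python
import heapq
from typing import Dict, Hashable, List, Optional, Tuple

Graph = Dict[Hashable, Dict[Hashable, float]]


def dijkstra(adj: Graph, src: Hashable, dst: Hashable) -> Optional[Tuple[float, List]]:
    """Return (cost, path) of the weighted shortest path, or None if unreachable."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in done:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]


def iterative_shortest_paths(adj: Graph, src: Hashable, dst: Hashable, k: int):
    """Find up to k paths by repeatedly running Dijkstra and removing the found path."""
    work = {u: dict(nbrs) for u, nbrs in adj.items()}  # copy; caller's graph is untouched
    paths = []
    while len(paths) < k:
        found = dijkstra(work, src, dst)
        if found is None:
            break
        paths.append(found)
        path = found[1]
        if len(path) < 2:      # src == dst: nothing to remove, stop
            break
        for u, v in zip(path, path[1:]):
            work[u].pop(v, None)               # remove the link in both directions
            work.get(v, {}).pop(u, None)
    return paths
```

Under this reading the returned paths share no links. If fewer than k paths exist, the function returns the ones it found, and the caller decides how to proceed.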

Simulation implementation and experiment
Simulation implementation. PyCharm can both run Python programs and create graphical interfaces. We use the editor pycharm-community-2019.1.1 to implement the algorithm and create a simulation environment.
Table 2. This is a record table, where n is the number of mapped links and b is the number of mapped links with the best performance compromise.
We combine the algorithms in section two and section three to implement the complete flow scheduling algorithm QOGMP, which executes Algorithm 2, Algorithm 1, and Algorithm 3 in that order. We perform a simplified simulation of the SDN network system. The simplified network system is divided into two layers: the control layer and the data layer. The controller of the control layer receives information from the data layer and calculates the traffic scheduling scheme. The data layer collects data and forwards flows. Data transmission is allowed between the two layers.
In the order of execution, the specific functional design ideas are as follows:
(1) Data collection function of the data layer: The data layer collects network information, including switches V, links E, and link parameters (cost C, link capacity W).
(2) The function of the controller to receive information from the data layer: The controller obtains the traffic view G(V, E, C, W) fed back from the data layer.
(3) The function of the controller to calculate the flow scheduling scheme: We embed the traffic scheduling algorithm into the controller as the main algorithm of the controller. We take the traffic views G1(V, E, C) and G2(V, E, W) and the user requirements (including business flow and delay tolerance) as input. After the main algorithm is executed, the traffic scheduling scheme is output.
(4) Flow forwarding function of the data layer: The data layer receives the scheme generated by the controller and forwards the flow according to it (the task at this stage is to calculate the transmission delay).
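One possible way to represent the traffic view G(V, E, C, W) and to derive the two views G1(V, E, C) and G2(V, E, W) is sketched below. The class name `TrafficView`, its fields, and the switch names are assumptions for illustration, and links are treated as undirected; the paper does not specify the data structure it uses.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

Link = Tuple[str, str]


@dataclass
class TrafficView:
    """G(V, E, C, W): switches, links, link cost C and link capacity W."""
    switches: Set[str]
    cost: Dict[Link, float]       # C, keyed by link (u, v)
    capacity: Dict[Link, float]   # W, keyed by link (u, v)

    def _adjacency(self, values: Dict[Link, float]) -> Dict[str, Dict[str, float]]:
        adj: Dict[str, Dict[str, float]] = {v: {} for v in self.switches}
        for (u, v), x in values.items():
            adj[u][v] = x
            adj[v][u] = x   # assumption: links are undirected
        return adj

    def cost_view(self) -> Dict[str, Dict[str, float]]:
        """G1(V, E, C): adjacency map annotated with link cost."""
        return self._adjacency(self.cost)

    def capacity_view(self) -> Dict[str, Dict[str, float]]:
        """G2(V, E, W): adjacency map annotated with link capacity."""
        return self._adjacency(self.capacity)


view = TrafficView(
    switches={"s1", "s2", "s3"},
    cost={("s1", "s2"): 1.0, ("s2", "s3"): 2.5},
    capacity={("s1", "s2"): 100.0, ("s2", "s3"): 40.0},
)
```

Keeping cost and capacity in separate views mirrors the split described in step (3): the cost view feeds the link weight calculation, and the capacity view feeds the traffic distribution.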

Verification experiment of traction link.
We conduct related experiments to verify the effectiveness of traction link. After implementing Algorithm 2, we record the amount of calculation saved after applying the traction link algorithm to verify whether the extraction of the traction link can greatly reduce the number of links.
We conduct experiments on 20 link graphs G = (V, E) with different numbers of links. Assuming that there are n nodes in V, the input format of the link graph (that is, the content of the data set) is an n × n numerical matrix, where e(i, j) = 1 indicates that there is a link between node i and node j, and e(i, j) = 0 indicates that there is no link between node i and node j. The data used in the experiment is randomly generated. We record the number of links in the input and output link graphs (links between the same pair of nodes are not counted twice), and the results are shown in Table 3 and Fig. 2.
Table 3 and Fig. 2 show that, after the traction link algorithm is applied, the saved calculation amount is at most 77 percent and at least 50 percent of the original data amount, with an average saving of 66 percent.
Therefore, the use of the traction link algorithm can greatly reduce the number of links that need to be processed without affecting the subsequent path selection results. It can save a large amount of calculation and improve the efficiency of the link weight calculation algorithm.
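The input format and the link count used in this experiment can be sketched as follows. The generator, its link probability `p`, and the definition of the saved fraction as (input links minus output links) divided by input links are assumptions for illustration; the traction link extraction itself (Algorithm 2) is not reproduced here.

```python
import random
from typing import List, Optional

Matrix = List[List[int]]


def random_link_matrix(n: int, p: float = 0.3, seed: Optional[int] = None) -> Matrix:
    """Random symmetric n x n 0/1 matrix with a zero diagonal."""
    rng = random.Random(seed)
    e = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                e[i][j] = e[j][i] = 1
    return e


def count_links(e: Matrix) -> int:
    """Number of links, counting each node pair once."""
    n = len(e)
    return sum(1 for i in range(n) for j in range(i + 1, n) if e[i][j] or e[j][i])


def saved_fraction(links_in: int, links_out: int) -> float:
    """Fraction of links that no longer need to be processed."""
    return (links_in - links_out) / links_in
```

For example, extracting 20 traction links from 100 original links corresponds to a saved fraction of 0.8.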
Comparative experiment. At present, the traditional method with the best performance (delay and rescheduling rate) is the QoS-oriented SDN global traffic scheduling algorithm proposed in literature [28,29]. The machine learning algorithm with the smallest delay is the SDN global multi-path traffic scheduling algorithm based on reinforcement learning proposed in literature [30]. Compared with the algorithm in literature [28,29], the QOGMP algorithm considers multi-path scheduling and has a higher link utilization rate. It also uses machine learning algorithms to speed up the calculation of weights, which makes it suitable for big data environments. Because QOGMP has obvious advantages over that algorithm, we do not verify this comparison experimentally.
We evaluate the performance of QOGMP on the simulation system we built. The performance indicators are delay and rescheduling rate. We compare QOGMP with the traffic scheduling algorithm that does not use traction links (that is, Algorithm 1 followed by Algorithm 3, hereinafter referred to as pre-QOGMP) and with the algorithm proposed in literature [30] (hereinafter referred to as GMPRL).
Delay here refers to the algorithm running time. We carry out comparative experiments on the three algorithms. In order to reduce the experimental error, each experiment needs to be measured multiple times to record the shortest running time.
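The measurement procedure (run several times, keep the shortest running time) can be expressed with the standard-library `timeit` module. This is a sketch of the procedure only; the helper name `shortest_running_time`, the default number of repetitions, and the placeholder workload are assumptions.

```python
import timeit
from typing import Callable


def shortest_running_time(func: Callable[[], object], repeat: int = 5) -> float:
    """Run func `repeat` times and return the shortest wall-clock time in seconds."""
    # number=1 times a single call per repetition; min() discards slower runs
    # that were disturbed by other activity on the machine.
    return min(timeit.repeat(func, number=1, repeat=repeat))


def placeholder_workload() -> int:
    # Stand-in for one run of a scheduling algorithm on one traffic view.
    return sum(i * i for i in range(10_000))


elapsed = shortest_running_time(placeholder_workload, repeat=3)
```

Taking the minimum rather than the mean is a common choice when the goal is to reduce the influence of background load on the measurement.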
For the delay indicator, the comparative experiment is completed on 20 different traffic views, and the experimental result is shown in Table 4.
Table 4 is plotted as Fig. 3. Analysis of Fig. 3 shows that: (1) the delay of QOGMP is always lower than that of pre-QOGMP; (2) the delay of QOGMP is not much different from that of GMPRL.
The experimental results show that, in terms of delay, QOGMP is better than pre-QOGMP and almost consistent with GMPRL. (1) Since both QOGMP and GMPRL use machine learning algorithms, the delay of QOGMP is not much different from that of GMPRL. (2) QOGMP adds a traction link extraction step, which increases the running time, but the introduction of traction links greatly reduces the amount of computation, so overall QOGMP outperforms pre-QOGMP in delay.
Since the three algorithms are all multi-path algorithms, they use the same criteria for the rescheduling judgment: a scheme is rescheduled if the sum of the traffic on all paths in the scheme generated by the algorithm is less than the service traffic, or if the transmission delay of a single path exceeds the delay tolerance.
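The rescheduling criteria can be written as a short check. This is a sketch under stated assumptions: the transmission delay of a path is taken to be the traffic assigned to it divided by its transmission rate, and the function and parameter names are hypothetical.

```python
from typing import Sequence


def needs_rescheduling(
    path_traffic: Sequence[float],
    path_rates: Sequence[float],
    service_traffic: float,
    delay_tolerance: float,
) -> bool:
    """True if the scheme fails either of the two rescheduling criteria."""
    # Criterion 1: the paths together carry less than the service traffic.
    if sum(path_traffic) < service_traffic:
        return True
    # Criterion 2: the transmission delay of some path exceeds the tolerance.
    return any(s / c > delay_tolerance for s, c in zip(path_traffic, path_rates))


def rescheduling_rate(outcomes: Sequence[bool]) -> float:
    """Share of schemes in a group of experiments that had to be rescheduled."""
    return sum(outcomes) / len(outcomes)
```

For example, a 1 Mb flow split as 0.8 and 0.2 over paths with assumed rates of 1 and 2 Mb/s is rescheduled under a delay tolerance of 0.6 s, because the first path needs 0.8 s.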
For the rescheduling rate indicator, we conduct 20 groups of comparative experiments, each of which is completed on 20 to 100 different data sets. The experimental result is shown in Table 5.
Table 5 is plotted as Fig. 4. Analysis of Fig. 4 shows that: (1) The rescheduling rate of QOGMP is never higher than that of GMPRL. In 17/20 cases, the rescheduling rate of QOGMP is lower than that of GMPRL, and in 3/20 cases, it is the same as that of GMPRL. (2) The rescheduling rate of QOGMP is almost the same as that of pre-QOGMP; it is slightly higher than that of pre-QOGMP in only 1/20 cases.
The experimental results show that, in terms of rescheduling rate, QOGMP is better than GMPRL and almost consistent with pre-QOGMP. (1) GMPRL does not consider QoS requirements, which increases the probability that the traffic scheduling scheme is not suitable for the service traffic, so QOGMP outperforms GMPRL in rescheduling rate. (2) Since the introduction of traction links does not have a great impact on the final scheme, QOGMP is almost identical to pre-QOGMP in rescheduling rate.
To sum up, compared with pre-QOGMP, QOGMP has lower delay and almost the same rescheduling rate; compared with GMPRL, QOGMP has lower rescheduling rate and almost the same delay. Therefore, our proposed QOGMP is better than GMPRL and pre-QOGMP for delay and rescheduling rate indicators.

Conclusion
We design and implement a QoS-oriented SDN global multi-path traffic scheduling algorithm, referred to as QOGMP. Aiming at the four problems currently existing in the research of SDN control layer traffic scheduling, QOGMP has adopted solutions: (1) QoS problem is solved by using the three QoS-related network parameters of packet loss, delay, and link capacity to generate traffic scheduling scheme; (2) The easy congestion problem is alleviated by utilizing the controller global view; (3) The high latency problem is alleviated by introducing traction links and deep reinforcement learning; (4) The single path problem is solved by multi-path scheduling. We carry out comparative experiments in the built simulation environment. The experimental results show that QOGMP has better performance than the two comparison algorithms in terms of delay and rescheduling rate. However, we still have room for improvement in terms of application scenario expansion, application algorithm improvement, and system details reproduction. The details are as follows.
(1) We only considered the three common QoS parameters of delay, packet loss, and link capacity. However, a complex network environment also involves other QoS parameters such as jitter. We can conduct in-depth research on QoS parameters and improve the algorithm proposed in this article in order to better cope with complex network environments and business requirements. (2) As the neural network is not the innovation or focus of the QOGMP algorithm, the neural network used in this article is the most basic one. It can be replaced with an improved neural network. For example, to improve the update stability of the neural network, we can use the target network method proposed in literature [34]. (3) The delay in the comparative experiment is up to 276 s, which is caused by hardware limitations. In future work, the delay can be shortened to meet practical application requirements through Brax accelerator hardware [35].